Abstract

One of the big challenges in the development of probabilistic re- 
lational (or probabilistic logical) modeling and learning frameworks 
is the design of inference techniques that operate on the level of the 
abstract model representation language, rather than on the level of 
ground, propositional instances of the model. Numerous approaches 
for such "lifted inference" techniques have been proposed. While it 
has been demonstrated that these techniques will lead to significantly 
more efficient inference on some specific models, there are only very 
recent and still quite restricted results that show the feasibility of lifted 
inference on certain syntactically defined classes of models. Lower 
complexity bounds that imply some limitations for the feasibility of 
lifted inference on more expressive model classes were established 
early on in (Jaeger 2000). However, it is not immediate that these re- 
sults also apply to the type of modeling languages that currently re- 
ceive the most attention, i.e., weighted, quantifier-free formulas. In this 
paper we extend these earlier results, and show that under the assumption
that NETIME ≠ ETIME, there is no polynomial lifted inference
algorithm for knowledge bases of weighted, quantifier- and function-free
formulas. Further strengthening earlier results, this is also shown
to hold for approximate inference, and for knowledge bases not con- 
taining the equality predicate. 



1 Introduction 

Probabilistic logic models (a.k.a. probabilistic or statistical relational models) provide
high-level representation languages for probabilistic models of structured data. One 
broad distinction one can make for the very many different types of models that have 
been proposed is between what one may call process-oriented and constraint-based 
models. In the former, the model essentially represents a generative stochastic process 
for a relational structure. It is defined in terms of prior and conditional probabilities, 
and its semantics can be given in terms of a directed graphical model, e.g., lfT HT8ll2n
[TtI m [6] [13] [15] [27], and many others. In the latter, the model defines a set of soft
constraints on a distribution, and the semantics can be given in terms of an undirected
graphical model [24] [20].

The problem of finding "lifted" inference techniques for probabilistic logic models 
has received a lot of attention for nearly 10 years now [19] [2] [16] [14] [10] [7] [26] [25]
[5]. Apart from methods for exact lifted inference, approximate lifted techniques
have also been investigated [22] [3] [23]. In most of these works it is demonstrated for some
particular models that lifted inference techniques provide a significant improvement in 
terms of how inference complexity scales as a function of the size of the model domain. 

On the other hand, [8] has shown that under certain assumptions on the expressivity
of the modeling language, probabilistic inference is not polynomial in the domainsize, 
thereby demonstrating some inherent limitations in terms of worst-case complexity for 
the goals of lifted inference. The results of [8] essentially assume a process-oriented
modeling framework, and the expressivity requirements amount to a probabilistic ver- 
sion of full first-order predicate logic. Since much recent work considers constraint- 
based frameworks, and, more significantly, within these frameworks focuses on frag- 
ments without full first-order expressivity, it is not clear to what extent these earlier 
intractability results are applicable to these ongoing efforts. In particular, much cur- 
rent work is devoted to models defined by constraints expressed by quantifier- and 
function-free formulas, which do not fulfill the requirements of [8]. In fact, for one limited
sub-class of such models (defined by the condition that formulas contain at most
two variables), [25] has recently proposed a lifted inference technique that guarantees
polynomial complexity. This appears to be the first result that establishes scalability of 
lifted inference for a whole model class defined by syntactic restrictions. Between this 
positive result, and the earlier negative results, the theoretical complexity boundaries 
for lifted inference are unknown. 

In this paper we refine the complexity map for lifted inference. Extending the general
approach taken in [8], we establish intractability results also for constraint-based
modeling languages limited to quantifier- and function-free formulas. In sharp contrast
to [8], where a "trivial" constant-time approximate inference method was described,
we show that our lower complexity bounds also hold for approximate inference.
Further sharpening earlier results, we finally establish that the lower complexity
bounds also hold for models not using the equality predicate, which in [8] was conjectured
to be the key source of inherent complexity.

In the following section we briefly review the inference problem for constraint-based
probabilistic logic models in terms of weighted model counting. Section 3 reviews
classic results relating first-order logic models to the complexity class NETIME.
Section 4 contains our main results, and Section 5 discusses some notable differences
that emerge between the results for process-oriented and for constraint-based models.

2 Weighted Model Counting



Similarly to [26] and Q, we assume the following framework: a model, or knowledge
base, is given by a set of weighted formulas:

$$
KB: \quad
\begin{array}{l}
\phi_1(v_1) : w_1 \\
\phi_2(v_2) : w_2 \\
\quad \vdots \\
\phi_N(v_N) : w_N
\end{array}
\qquad (1)
$$

where the $\phi_i$ are formulas in first-order predicate logic, the $w_i \in \mathbb{R}$ are non-negative
weights, and $v_i = (v_{i,1}, \ldots, v_{i,k_i})$ are the free variables of $\phi_i$. The case $k_i = 0$, i.e.,
$\phi_i$ is a sentence without free variables, is also permitted. The $\phi_i$ use a given signature
$S$ of relation, function, and constant symbols.

An interpretation (or possible world) $(D, I)$ for $S$ consists of a domain $D$, and
an interpretation function $I$ that maps the symbols in $S$ to functions, relations and
elements on $D$. For a tuple $d \in D^{k_i}$ the truth value of $\phi_i(d/v_i)$ is then defined, and we
write $(D, I) \models \phi_i(d)$, or simply $I \models \phi_i(d)$, if $\phi_i(d/v_i)$ is true in $(D, I)$. We use
$\mathcal{I}(D, S)$ to denote the set of all interpretations for the signature $S$ over the domain $D$.

In this paper we are only concerned with finite domains, and assume without loss
of generality that $D = D_n := \{1, \ldots, n\}$ for some $n \in \mathbb{N}$.

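For $I \in \mathcal{I}(D_n, S)$ let $\#(i, I)$ denote the number of elements $d$ in $D_n^{k_i}$ for which
$I \models \phi_i(d)$. The weight of $I$ then is

$$
w^{KB}_n(I) := \prod_{i=1}^{N} w_i^{\#(i, I)}
$$

where $0^0 = 1$. The probability of $I$ is

$$
P^{KB}_n(I) = w^{KB}_n(I) / Z
$$

where $Z$ is the normalizing constant (partition function)

$$
Z = \sum_{I' \in \mathcal{I}(D_n, S)} w^{KB}_n(I'). \qquad (2)
$$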
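
As a minimal illustration of definitions (1) and (2), the following Python sketch computes weights, the partition function, and a query probability by brute-force enumeration of all interpretations. The toy knowledge base (one unary relation `r` with the single weighted formula `r(x) : 2`) and all function names are assumptions made for this sketch only; they do not come from the frameworks cited above.

```python
from fractions import Fraction
from itertools import product

def interpretations(n):
    """Yield every interpretation of one unary relation r over D_n = {0, ..., n-1}."""
    for bits in product([False, True], repeat=n):
        yield {"r": {d for d in range(n) if bits[d]}}

def weight(interp, n, kb):
    """Product over all formulas of w_i ** (number of true groundings); 0 ** 0 == 1."""
    total = Fraction(1)
    for count_true, w in kb:
        total *= Fraction(w) ** count_true(interp, n)
    return total

# Toy knowledge base (hypothetical): the single weighted formula r(x) : 2.
kb = [(lambda interp, n: len(interp["r"]), 2)]

def probability(query, n, kb):
    """P_n^KB(query): weight of the worlds satisfying the query, divided by Z."""
    z = sum(weight(i, n, kb) for i in interpretations(n))
    numerator = sum(weight(i, n, kb) for i in interpretations(n) if query(i))
    return numerator / z

# Probability of the ground atom r(0) over a domain of size 3.
p = probability(lambda i: 0 in i["r"], n=3, kb=kb)
```

The enumeration is exponential in $n$, which is exactly the cost that lifted inference tries to avoid.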

For a first-order sentence $\phi$ and $n \in \mathbb{N}$ then

$$
P^{KB}_n(\phi) := P^{KB}_n(\{I \in \mathcal{I}(D_n, S) \mid I \models \phi\})
$$

is the probability of $\phi$ in $\mathcal{I}(D_n, S)$. An inference problem $PI(KB, n, \phi, \psi)$ is given by
a knowledge base KB, a domainsize $n \in \mathbb{N}$, and two first-order sentences $\phi$, $\psi$. The
solution to the inference problem is the conditional probability $P^{KB}_n(\phi \mid \psi)$.

A class of inference problems is defined by allowing arguments KB, $\phi$, and $\psi$ only
from some restricted classes $\mathcal{KB}$, $\mathcal{Q}$ (the query class), and $\mathcal{E}$ (the evidence class),
respectively. In this paper we will only be concerned with the cases where $\mathcal{Q}$ consists
of all ground atoms, denoted AT, and $\mathcal{E}$ is empty, i.e., we are only considering inference
without evidence. Classes $\mathcal{KB}$ are defined by various syntactic restrictions on the
formulas in the knowledge base. In this paper, we consider the following fragments
of first-order logic (FOL): relational FOL (RFOL), i.e., FOL without function and constant
symbols; 0-RFOL, which is the quantifier-free fragment of RFOL; and 0-RFOL$^{\neq}$,
which is 0-RFOL without the equality relation.

An algorithm solves a class $PI(\mathcal{KB}, \mathbb{N}, \mathcal{Q}, \mathcal{E})$, if it computes $P^{KB}_n(\phi \mid \psi)$ for all
instances $PI(KB, n, \phi, \psi)$ in the class. An algorithm $\epsilon$-approximately solves $PI(\mathcal{KB}, \mathbb{N}, \mathcal{Q}, \mathcal{E})$,
if for any $PI(KB, n, \phi, \psi)$ in the class it returns a number
$p \in [P^{KB}_n(\phi \mid \psi) - \epsilon, P^{KB}_n(\phi \mid \psi) + \epsilon]$. An algorithm that solves $PI(\mathcal{KB}, \mathbb{N}, \mathcal{Q}, \mathcal{E})$ is polynomial in the domainsize, if
for fixed KB, $\phi$, $\psi$ the computation of $PI(KB, n, \phi, \psi)$ is polynomial in $n$.

3 Spectra and Complexity 

The following definition introduces the central concept for our analysis. 

Definition 3.1 Let $\psi$ be a sentence in first-order logic. The spectrum of $\psi$ is the set of
integers $n \in \mathbb{N}$ for which $\psi$ is satisfiable by an interpretation of size $n$.

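Example 3.2 Let $\psi = \psi_1 \wedge \psi_2 \wedge \psi_3$, where

$$
\begin{array}{l}
\psi_1 = \forall x, y \;\; u(x, y) \Rightarrow u(y, x) \\
\psi_2 = \forall x \, \exists y \;\; u(x, y) \\
\psi_3 = \forall x, y, y' \;\; (u(x, y) \wedge u(x, y') \Rightarrow y = y')
\end{array}
$$

$\psi$ expresses that the binary relation $u$ defines an undirected graph ($\psi_1$) in which every
node is connected to exactly one other node ($\psi_2$, $\psi_3$). Thus, $\psi$ describes a pairing
relation that is satisfiable exactly over domains of even size: $spec(\psi) = \{n \mid n \text{ even}\}$.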
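
The spectrum of the pairing sentence can be checked by brute force for very small domains. The Python sketch below enumerates all binary relations over $D_n$ for $n \le 4$. It assumes, as the phrase "exactly one other node" suggests, that $u$ is additionally irreflexive; this is an assumption of the sketch, since without it a relation consisting only of self-loops would satisfy $\psi_1$, $\psi_2$ and $\psi_3$ for every $n$. The function names are hypothetical.

```python
from itertools import product

def is_pairing(n, u):
    """Check psi_1, psi_2, psi_3 for a set u of ordered pairs over range(n)."""
    if any((y, x) not in u for (x, y) in u):   # psi_1: symmetry
        return False
    if any(x == y for (x, y) in u):            # assumed here: no self-loops
        return False
    # psi_2 and psi_3 together: every x has exactly one partner y
    return all(sum((x, y) in u for y in range(n)) == 1 for x in range(n))

def in_spectrum(n):
    """True iff some binary relation over a domain of size n satisfies psi."""
    pairs = [(x, y) for x in range(n) for y in range(n)]
    for bits in product([False, True], repeat=len(pairs)):
        u = {p for p, bit in zip(pairs, bits) if bit}
        if is_pairing(n, u):
            return True
    return False

# Domain sizes 1..4 only: the search space has 2 ** (n * n) relations.
small_spectrum = [n for n in range(1, 5) if in_spectrum(n)]
```

Under the stated assumption, `small_spectrum` contains exactly the even sizes in the tested range.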

The complexity class ETIME consists of problems solvable in time $O(2^{cn})$, for
some constant $c$. The corresponding nondeterministic class is NETIME. Note that these
classes are distinct from the more commonly studied classes (N)EXPTIME, which are
characterized by complexity bounds $O(2^{n^c})$ [11]. For $n \in \mathbb{N}$ let $bin(n) \in \{0, 1\}^*$
denote the binary coding of $n$, and $un(n) \in \{1\}^*$ the unary coding (i.e., $n$ is represented
as a sequence of $n$ 1s). A set $S \subseteq \mathbb{N}$ is in (N)ETIME, iff $\{bin(n) \mid n \in S\}$ is in
(N)ETIME, which also is equivalent to $\{un(n) \mid n \in S\}$ being in (N)PTIME.
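
The equivalence between ETIME on binary inputs and PTIME on unary inputs rests on simple length bookkeeping: since $|bin(n)| \le \log_2 n + 1$, a time budget of $2^{c \cdot |bin(n)|}$ is at most $(2n)^c$, which is polynomial in $|un(n)| = n$. The following Python sketch checks this inequality numerically; the helper names and the choice $c = 3$ are illustrative assumptions.

```python
def bin_code(n):
    """Binary coding of n as a string over {0, 1}."""
    return format(n, "b")

def un_code(n):
    """Unary coding of n: a sequence of n 1s."""
    return "1" * n

def etime_budget(n, c):
    """Step budget 2 ** (c * |bin(n)|), i.e. exponential in the binary input length."""
    return 2 ** (c * len(bin_code(n)))

# The same budget is bounded by the polynomial (2n) ** c in the unary length n.
checks = [etime_budget(n, 3) <= (2 * n) ** 3 for n in range(1, 200)]
```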

Like [8], we use the following connection between spectra and NETIME as the key
tool for our complexity analysis.

Theorem 3.3 [12] A set $A \subseteq \mathbb{N}$ is in NETIME, iff $A$ is the spectrum of a sentence $\phi \in$
RFOL.

Corollary 3.4 If NETIME ≠ ETIME, then there exists a first-order sentence $\phi$, such
that $\{un(n) \mid n \in spec(\phi)\}$ is not recognized in deterministic polynomial time.

Thus, by reducing the spectra-recognition problem to a class of inference problems
$PI(\mathcal{KB}, \mathbb{N}, \mathcal{Q}, \mathcal{E})$, one establishes that the latter is not polynomial in the domainsize
(under the assumption ETIME ≠ NETIME).

4 Complexity Results



This section contains our complexity results. We begin with a result for knowledge
bases using full RFOL. This is rather straightforward, and (for exact inference) already
implied by the results of [8]. We then proceed to extend this base result to 0-RFOL
and 0-RFOL$^{\neq}$.

Theorem 4.1 If NETIME ≠ ETIME, then there does not exist an algorithm that 0.25-approximately
solves $PI(\mathrm{RFOL}, \mathbb{N}, AT, \emptyset)$ in time polynomial in the domainsize.

The proof of this theorem provides the general pattern also for subsequent proofs. 
It is therefore here given in full. 

Proof: Let $\phi$ be a sentence with a "hard" spectrum as given by Corollary 3.4. Let $S$ be
the relational signature of $\phi$. Let $a()$ be a new relation symbol of arity zero (i.e., $a()$
represents a propositional variable). The first weighted formula in our knowledge base
then is

$$
\neg(\phi \leftrightarrow a()) : 0 \qquad (3)
$$

We now already have that $P^{KB}_n(a()) > 0$ iff there exists $I \in \mathcal{I}(D_n, S)$ with $I \models \phi$,
i.e., iff $n \in spec(\phi)$. This already reduces the decision problem for $spec(\phi)$ to solving
$PI(KB, n, a(), \emptyset)$ exactly. However, from the 0-1 laws of first-order logic [4], it follows
that for our current KB: $P^{KB}_n(a()) \to 0$ as $n \to \infty$. Thus, for every $\epsilon > 0$ we could define
an $\epsilon$-approximate constant-time inference algorithm by returning 0 for all sufficiently
large $n$.

In order to obtain our result for approximate inference, we will now ensure that for
all $n \in spec(\phi)$ the probability $P^{KB}_n(a())$ is at least 0.5. We do this essentially
by calibrating the normalization constant $Z$ in (2). For this we introduce another new
relation symbol $b()$ of arity zero and add to KB:

$$
\neg\Big(\big(\bigwedge_{R \in S} \forall x \, \neg R(x)\big) \leftrightarrow b()\Big) : 0 \qquad (4)
$$

Thus, for every $n$ there is exactly one interpretation $I \in \mathcal{I}(D_n, S)$ with nonzero
weight in which $b()$ is true (the one in which all relations have empty interpretations).
Finally, we give zero weight to all interpretations except those in which $a()$ or $b()$ is
true:

$$
\neg(a() \vee b()) : 0 \qquad (5)
$$

Let KB consist of (3), (4), (5). Every $I \in \mathcal{I}(D_n, S)$ then has weight 0 if it satisfies
one of the three formulas, and weight 1 otherwise. Consider the case $n \notin spec(\phi)$.
Then, by (3), $w^{KB}_n(a()) = 0$. By (5) this then means that in all interpretations of nonzero
weight $b()$ must be true. By (4) there is exactly one such interpretation. Thus, $Z$ in (2)
is 1, and $P^{KB}_n(a()) = 0/1 = 0$.

If $n \in spec(\phi)$, then $w^{KB}_n(a()) \geq 1$, and $Z = w^{KB}_n(a())$ (if the interpretation in
which all $R$ are empty also is a model of $\phi$), or $Z = w^{KB}_n(a()) + 1$ (otherwise). Thus,
$P^{KB}_n(a()) \geq 1/2$. A 0.25-approximate inference algorithm for $PI(KB, n, a(), \emptyset)$, thus,
would decide $spec(\phi)$. □
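
The last step of the proof amounts to a threshold test on the approximate answer. The Python sketch below illustrates it with a hypothetical stand-in oracle: `toy_oracle` pretends that $spec(\phi)$ is the set of even numbers, uses the worst-case exact value $1/2$ on the spectrum, and adds a fixed error. Both function names and the oracle are assumptions of this sketch. It further assumes that the error is strictly smaller than 0.25, since an error of exactly 0.25 could return the value 0.25 in both cases.

```python
def decide_membership(approx_p):
    """Decide n in spec(phi) from an approximation of P_n^KB(a())."""
    return approx_p > 0.25

def toy_oracle(n, error):
    """Hypothetical approximate inference result; spec(phi) is taken to be the even numbers."""
    exact = 0.5 if n % 2 == 0 else 0.0   # exact probability: >= 1/2 on the spectrum, 0 off it
    return exact + error

# Errors of magnitude 0.2 never flip the decision.
decisions = [
    decide_membership(toy_oracle(n, err))
    for n in range(1, 7)
    for err in (-0.2, 0.2)
]
```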

We now proceed towards our main result, which is going from RFOL to 0-RFOL.
If we wanted to allow function and constant symbols in our knowledge base, then one
could go to a quantifier-free fragment in a quite straightforward manner using Skolemization.
Since satisfiability over a given domain is the same for a formula $\phi$ and its
quantifier-free Skolemized version $\phi^{Sk}$, the arguments of the proof of Theorem 4.1
would go through with little change. In order to accomplish the same using only the
relational fragment 0-RFOL, we define the relational Skolemization of a formula. The
idea is to replace function and constant symbols in the Skolemized version of a formula
with relational representations. For example, the Skolemized version of $\psi_2$ from
Example 3.2 is

$$
\psi_2^{Sk} = \forall x \;\; u(x, f(x))
$$

with a new function symbol $f()$. Introducing a relational encoding of $f()$ leads to

$$
\psi_2^{RSk} = \forall x, y \;\; R_f(x, y) \Rightarrow u(x, y)
$$

with a new relation symbol $R_f$ encoding $f()$. This translation must be accompanied
by axioms that confine the possible interpretations of $R_f$ to relations that encode functions.

Such relational encodings of functions are well established. However, there does 
not seem to be a standard account of this technique that serves our purpose. The fol- 
lowing proposition, therefore, provides the relevant result in a form tailored for our 
needs. 

Proposition 4.2 Let $\phi(x) \in$ 0-FOL$(S \cup S^f)$, where $S$ is a set of relation symbols, and
$S^f$ a set of function and constant symbols. Let $S^+$ be a set of new relation symbols that
for every $k$-ary $f \in S^f$ contains a $(k+1)$-ary $R_f$ (constant symbols are treated as 0-ary
function symbols). Let Func be the set of sentences that for every $f \in S^f$ contains

$$
\begin{array}{ll}
\forall x, y, y' \;\; (R_f(x, y) \wedge R_f(x, y') \Rightarrow y = y') & \qquad (6) \\
\forall x \, \exists y \;\; R_f(x, y) & \qquad (7)
\end{array}
$$

Then there exists a formula $\phi^+(x, z) \in$ 0-RFOL$(S \cup S^+)$, such that the following are
equivalent:

i there exists $I \in \mathcal{I}(D_n, S \cup S^f)$ with $I \models \forall x \, \phi(x)$

ii there exists $I^+ \in \mathcal{I}(D_n, S \cup S^+)$ with $I^+ \models \mathit{Func} \wedge \forall x z \; \phi^+(x, z)$

If $\phi^{Sk}$ is the Skolemization of a formula $\phi \in$ RFOL, we then call $(\phi^{Sk})^+$ the relational
Skolemization of $\phi$, written $\phi^{RSk}$.
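
The correspondence between functions and relations satisfying Func can be made concrete with a small Python sketch. The helpers `encode`, `satisfies_func` and `decode` are hypothetical names introduced for this illustration; the domain is taken to be `range(n)`.

```python
from itertools import product

def encode(f, n, k):
    """Relational representation R_f of a k-ary function f on range(n)."""
    return {d + (f(*d),) for d in product(range(n), repeat=k)}

def satisfies_func(rel, n, k):
    """Axioms (6) and (7): every argument tuple d has exactly one value e."""
    for d in product(range(n), repeat=k):
        values = [e for e in range(n) if d + (e,) in rel]
        if len(values) != 1:
            return False
    return True

def decode(rel, k):
    """Recover the function from a relation that satisfies Func."""
    table = {t[:k]: t[k] for t in rel}
    return lambda *d: table[d]

n = 3
r_succ = encode(lambda x: (x + 1) % n, n, 1)   # unary function symbol
r_const = encode(lambda: 1, n, 0)              # constant as a 0-ary function
```

Removing a tuple from `r_succ` violates (7), and adding a second value for the same argument violates (6).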

Our plan, now, is to prove the analogue of Theorem 4.1 for 0-RFOL by replacing
$\phi$ in (3) with $\phi^{RSk}$. However, this is not enough, since we also need to constrain the
models of our knowledge base (more precisely: those models in which $a()$ is true) to
satisfy the axioms (6) and (7). This poses a problem, because (7) contains an existential
quantifier, and so we cannot add this axiom directly as a constraint to a knowledge base
restricted to 0-RFOL.

The solution to this problem is to approximate (7) with a weighted formula

$$
a() \wedge R_f(x, y) : w \qquad (8)
$$

that rewards models of $a()$ in which the existential quantifier of (7) is satisfied for
many (all) $x$. We will no longer be able to ensure that $w^{KB}_n(a()) = 0$ when $n \notin spec(\phi)$.
However, by a suitable choice of $w$, and by a careful calibration of the weight of models
of the alternative proposition $b()$, we still can ensure that $w^{KB}_n(b()) \gg w^{KB}_n(a())$ when
$n \notin spec(\phi)$, and $w^{KB}_n(b()) \leq w^{KB}_n(a())$ when $n \in spec(\phi)$. This choice of $w$, however,
now will have to be a function of $n$, i.e., we cannot reduce the decision problem of
$spec(\phi)$ to probabilistic inference for a fixed knowledge base KB, but to probabilistic
inference for a parameterized knowledge base $KB(w(n))$. This is why the following
theorem is a little weaker than the previous one in that it also includes a condition
on the inference algorithm to be polynomial in the representation size of the weight
parameters.

Theorem 4.3 If NETIME ≠ ETIME, then there does not exist an algorithm that 0.25-approximately
solves $PI(\text{0-RFOL}, \mathbb{N}, AT, \emptyset)$ in time polynomial both in the domainsize,
and the representation size $l := \sum_{i=1}^{N} \log(w_i)$ of the weight parameters.

The full proof of the theorem is given in the appendix. 

One may wonder how strong or surprising Theorem 4.3 really is in light of its extra
"polynomial runtime in $l$" condition, especially since it has been emphasized that lifted
inference procedures should only be expected to be polynomial in the domain size, but
not in other parameters that characterize the complexity of KB [8] [25]. These remarks,
however, have mostly been motivated by considerations of the logical complexity of 
KB, e.g. in terms of the number and complexity of its weighted formulas, or the size of 
the signature. The complexity in terms of numerical parameters, on the other hand, has 
not received much attention. 

To better understand the nature of the "polynomial in $l$" condition, we consider inference
algorithms that can be described as follows: to compute $PI(KB, n, \phi, \psi)$ the
algorithm performs a number of steps, where step $i$ either consists of executing a constant
time operation that does not depend on the numerical model parameters (e.g., a
logical operation on formulas), or of an operation on numbers $k, l \in V(i)$, where $V(i)$
is the set of all numerical variables (original weight parameters, intermediate computed
results, ...) stored by the algorithm at step $i$. Further assume that the values of the
weight parameters in KB only affect the numerical values of the variables in $V(i)$, but
not the sequence of execution steps performed by the algorithm, and, in consequence,
not the set of variables contained in $V(i)$. Also assume that an operation performed on
$k, l \in V(i)$ is linear in the representation size of $k$ and $l$ (as is the case for basic addition
and multiplication operations). Then the representation size of the values in $V(i)$, and
the execution time of a numerical computation step will always be a linear function of
the size of the original $w_i$ parameters, and the algorithm is polynomial (linear, in fact)
in $l$. Since most existing lifted inference algorithms fit this description, it seems that
the condition of being polynomial in $l$ is not a severe limitation on the applicability of
Theorem 4.3.

In a final strengthening of our results, we now move on to the fragment 0-RFOL$^{\neq}$.
The availability of the equality predicate for the formulas of KB, so far, has been an
important prerequisite for our arguments, because Theorem 3.3 crucially depends on
equality: spectra for formulas $\phi \in$ RFOL$^{\neq}$ are always of the form $\mathbb{N} \setminus \{1, \ldots, k\}$ for
some $k$, and, thus, decidable in constant time. For this reason it was suggested in [8]
that one should focus on logical fragments without equality when looking for model
classes for which lifted inference scales polynomially in the domainsize. As our final
result shows, however, elimination of equality may not have such a large impact on
complexity, after all.

Theorem 4.4 If NETIME ≠ ETIME, then there does not exist an algorithm that 0.25-approximately
solves $PI(\text{0-RFOL}^{\neq}, \mathbb{N}, AT, \emptyset)$ in time polynomial in $n$ and the
representation size $l := \sum_{i=1}^{N} \log(w_i)$ of the weight parameters.

This theorem is a generalization of Theorem 4.3 and, strictly speaking, makes Theorem 4.3
redundant. It is only for expository purposes, and greater transparency in the proof
arguments, that we here develop these results in two steps.

The proof of Theorem 4.4 is a refinement of the proof of Theorem 4.3. In addition
to approximating Skolem functions $f$ with relations $R_f$, we now also approximate the
equality predicate = with a binary relation $E(\cdot, \cdot)$. Just as we could not impose
in 0-RFOL hard constraints that ensure that $R_f$ encodes a function, we also cannot
constrain models to exactly interpret $E$ as the equality relation. However, in analogy to
(8) we can approximate true equality using the two weighted formulas

$$
\begin{array}{ll}
a() \wedge \neg E(x, x) : 0 & \qquad (9) \\
a() \wedge E(x, y) : 1/w & \qquad (10)
\end{array}
$$

where $w$ is a large weight. Any model that does not interpret $E$ as the equality relation
then incurs a penalty of at least one additional factor $1/w$.
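
The effect of (9) and (10) on a model in which $a()$ is true can be sketched in a few lines of Python. The function `equality_factor` and the concrete values of `n` and `w` are assumptions of this illustration: reflexivity is enforced by (9), and every true ground atom $E(x, y)$ contributes one factor $1/w$ by (10), so any reflexive relation larger than true equality is penalized.

```python
from fractions import Fraction

def equality_factor(n, e_rel, w):
    """Weight factor contributed by (9) and (10) when a() is true."""
    if any((x, x) not in e_rel for x in range(n)):
        return Fraction(0)                    # (9): E must be reflexive
    return Fraction(1, w) ** len(e_rel)       # (10): one factor 1/w per true E(x, y)

n, w = 3, 100
true_equality = {(x, x) for x in range(n)}
coarser = true_equality | {(0, 1), (1, 0)}    # additionally identifies 0 and 1

# Relative penalty of the coarser relation compared with true equality.
ratio = equality_factor(n, coarser, w) / equality_factor(n, true_equality, w)
```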

5 Approximate Inference and Convergence 

There are some notable differences with respect to approximate inference between the
results we obtained here for weighted model counting, and the results of [8], where
it was shown that due to convergence of query probabilities as $n \to \infty$, in theory a
trivial constant time approximation algorithm exists: perform exact inference for all
input domains up to a size $n^*$, and output the limit probability for all domains of size
$> n^*$. This "algorithm", however, has no practical use, since for a desired accuracy
value $\epsilon$ one first would have to determine a sufficiently high threshold value $n^* \in \mathbb{N}$ to
make the output indeed be an $\epsilon$-approximation.

Nevertheless, the difference between the existence of an impractical approximation
algorithm on the one hand, and the non-existence of any approximation algorithm on
the other hand, is just one consequence of a more fundamental difference: while in the
models considered in [8] query probabilities $P_n(a())$ converge to a limit, this is not
the case for knowledge bases of weighted formulas, even under the restriction to 0-RFOL:
in the proofs of Section 4 we have constructed knowledge bases KB, such that
$P^{KB}_n(a())$ oscillates between zero and values $\geq 1/2$ as $n$ oscillates between $spec(\phi)$ and
its complement. Note that the construction of knowledge bases with this behavior does
not require formulas $\phi$ with a hard spectrum as in Corollary 3.4, and is not contingent
on NETIME ≠ ETIME. Already a knowledge base as constructed in the proof of
Theorem 4.1 with $\phi$ replaced by $\psi$ of Example 3.2 will show this behavior.
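
The oscillation can be observed directly for tiny domains. The Python sketch below computes $P^{KB}_n(a())$ exactly by brute force for the knowledge base (3), (4), (5) with $\phi$ replaced by the pairing sentence $\psi$. It assumes that $u$ is additionally irreflexive, in line with the reading "exactly one other node"; without this assumption self-loops would satisfy $\psi$ for every $n$. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def is_pairing(n, u):
    """psi of Example 3.2, plus irreflexivity (an assumption of this sketch)."""
    if any((y, x) not in u for (x, y) in u):
        return False
    if any(x == y for (x, y) in u):
        return False
    return all(sum((x, y) in u for y in range(n)) == 1 for x in range(n))

def prob_a(n):
    """Exact P_n^KB(a()) for the knowledge base (3), (4), (5) with phi := psi."""
    pairs = [(x, y) for x in range(n) for y in range(n)]
    models = 0
    for bits in product([False, True], repeat=len(pairs)):
        u = {p for p, bit in zip(pairs, bits) if bit}
        if is_pairing(n, u):
            models += 1      # (3): a() is true exactly in the models of psi
    # (4), (5): the only other world of nonzero weight has u empty and b() true.
    # For n >= 1 the empty relation is not a pairing, so it adds 1 to Z.
    return Fraction(models, models + 1)

probs = [prob_a(n) for n in range(1, 5)]
```

For $n = 1, \ldots, 4$ the values alternate between 0 on odd sizes and at least $1/2$ on even sizes.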

The reason behind these different convergence properties lies in a somewhat different
role that conditioning on evidence plays in process-oriented and constraint-based
models: in the former, a conditional probability $P^{M}(a() \mid b())$ defined by a model $M$
can, in general, not be defined as an unconditional probability $P^{M'}(a())$ in a modified
model $M'$. For constraint knowledge bases KB, on the other hand, one can just add to
KB the hard constraint $\neg b() : 0$ to obtain $KB'$ with $P^{KB'}_n(\cdot) = P^{KB}_n(\cdot \mid b())$. Thus, there
are here no fundamental differences between conditional and unconditional probabilistic
queries. For process-oriented models, on the other hand, this difference is instrumental
for the convergence of query probabilities: such a convergence only is guaranteed for
unconditional queries, and can easily fail for conditional ones.

6 Conclusion 

We have shown that for currently quite popular relational probabilistic models consisting
of collections of weighted, quantifier- and function-free formulas there is likely to
be no general polynomial lifted inference method (contingent on NETIME ≠ ETIME).
Somewhat surprisingly, this even holds for approximate inference. Between this negative
result, and the positive result of [25], there still could be a lot of room for identifying
tractable fragments by restricting 0-RFOL further via limits on the number of
variables, or the richness of the signature $S$.

A Proofs 

Proof of Proposition 4.2:

We begin by defining the term depth of a term $t$ in the signature $S^f$ as the maximal
nesting depth of function symbols in $t$. Precisely, we define inductively: if $t = x$, then
$t$ has term depth 0. If $t = f()$ (a constant), or $t = f(x_1, \ldots, x_k)$ (a function term with
only variables as arguments), then $t$ has term depth 1. If $t = f(t_1, \ldots, t_k)$, then the
term depth of $t$ is one plus the maximal term depth of the $t_i$.

The term depth of a formula $\phi(x)$ is the maximal term depth of the terms it contains.

We now show that every formula $\phi(x)$ of term depth $l$ can be transformed into a
formula $\phi^{(1)}(x, z)$ of term depth $l - 1$ in 0-FOL$(S \cup S^f \cup S^+)$, such that the statement
for $\phi^+$ of the proposition holds for $\phi^{(1)}$ (but with $S \cup S^f \cup S^+$ instead of $S \cup S^+$ in ii).
The proposition then follows by defining $\phi^+$ as the result of iteratively applying $l$ such
transformations to $\phi$. Since the term depth of the resulting $\phi^+$ is zero, then actually
$\phi^+(x, z) \in$ 0-RFOL$(S \cup S^+)$.

Let $\{f_i(x_i) \mid i = 1, \ldots, r\}$ be the set of all distinct terms (including sub-terms) of
depth 1 appearing in $\phi$. Let $z_1, \ldots, z_r$ be new variables. Define $\phi^{(1)}(x, z)$ as

$$
\bigwedge_{i=1}^{r} R_{f_i}(x_i, z_i) \Rightarrow \phi(x)[z_1/f_1(x_1), \ldots, z_r/f_r(x_r)]
$$

To now show i ⇒ ii, let $I \in \mathcal{I}(D_n, S \cup S^f)$ with $I \models \forall x \, \phi(x)$. Define
$I^+ \in \mathcal{I}(D_n, S \cup S^f \cup S^+)$ as the expansion of $I$ in which each $R_f \in S^+$ is interpreted as the relational
representation of $f$, i.e., $I^+ \models R_f(d, e)$ iff $I \models f(d) = e$. Clearly, $I^+ \models \mathit{Func}$.
Furthermore, the following are equivalent:

$$
\begin{array}{l}
I \models \forall x \, \phi(x) \\
I \models \forall x z \; \bigwedge_{i=1}^{r} f_i(x_i) = z_i \Rightarrow \phi(x)[z_1/f_1(x_1), \ldots, z_r/f_r(x_r)] \\
I^+ \models \forall x z \; \bigwedge_{i=1}^{r} R_{f_i}(x_i, z_i) \Rightarrow \phi(x)[z_1/f_1(x_1), \ldots, z_r/f_r(x_r)]
\end{array}
$$

For ii ⇒ i let $I^+$ as in ii be given. Since $I^+ \models \mathit{Func}$, we can turn $I^+$ into an interpretation
$I$ for $S \cup S^f$ by defining $f(d)$ as the unique $e$ for which $R_f(d, e)$ holds in $I^+$.
Then, by the same equivalences as above, $I^+ \models \forall x z \; \phi^{(1)}(x, z)$ implies $I \models \forall x \, \phi(x)$.

□

Proof of Theorem 4.3: Let $\phi \in$ RFOL as given by Corollary 3.4, and $\forall x \, \phi^{RSk}(x)$
its relational Skolemization. Let $S$ be the original signature of $\phi$, and $S^+$ the relation
symbols introduced in the relational Skolemization. Furthermore, for each $k$-ary $R^+ \in S^+$
we introduce a new $(k-1)$-ary relation $R^{++} \in S^{++}$. These new symbols will be used to
calibrate the weight of models for the reference proposition $b()$. Note that the arity of
symbols in $S^+$ is at least 1, and thus, $S^{++}$ is well-defined, but may contain relations
of arity 0.

The first formula in our knowledge base is

$$
a() \wedge \neg\phi^{RSk}(x) : 0 \qquad (11)
$$

We now approximately axiomatize the functional nature of the symbols $R^+ \in S^+$.
The sentence (6) can be directly encoded as a weighted formula:

$$
R^+(x, y) \wedge R^+(x, y') \wedge y \neq y' : 0 \quad (R^+ \in S^+) \qquad (12)
$$

Next, we would like to enforce (7) by means of a weighted formula. However,
(7) encodes the essence of the existential quantifiers we are about to eliminate, and,
thus, it is not surprising that this is not possible to enforce strictly. However, we can
reward models in which the existential quantification of (7) is satisfied via the weighted
formulas

$$
a() \wedge R^+(x, y) : w \quad (R^+ \in S^+) \qquad (13)
$$

where $w > 1$ is a weight whose exact value is to be defined later.

We now proceed with constraining models of the reference proposition $b()$. First,

$$
\begin{array}{lll}
b() \wedge R(x) : 0 & (R \in S) & \qquad (14) \\
b() \wedge R^+(x, y) : 0 & (R^+ \in S^+) & \qquad (15)
\end{array}
$$

In order to allow $b()$-models to gain some weight, we use the extra symbols in $S^{++}$:

$$
b() \wedge R^{++}(x) : w \quad (R^{++} \in S^{++}) \qquad (16)
$$

where $w$ is the same weight as in (13). To limit the possible interpretations of $b()$-models,
we also stipulate:

$$
b() \wedge \neg R^{++}(x) : 0 \quad (R^{++} \in S^{++}) \qquad (17)
$$

The extra symbols must have empty interpretations in $a()$-models:

$$
a() \wedge R^{++}(x) : 0 \quad (R^{++} \in S^{++}) \qquad (18)
$$

Finally, we add:

$$
\neg(a() \vee b()) : 0 \qquad (19)
$$

We now determine (approximately) $w^{KB}_n(a())$ and $w^{KB}_n(b())$ for the cases $n \in
spec(\phi)$ and $n \notin spec(\phi)$.

First, consider $b()$: for any $n$, there exists exactly one interpretation
$I_{b()} \in \mathcal{I}(D_n, S \cup S^+ \cup S^{++} \cup \{a(), b()\})$ of nonzero weight in which $b()$ is true. This is the
interpretation in which all relations in $S \cup S^+$ are empty ((14), (15)), all relations in $S^{++}$
are maximal ((17)), and, in consequence of the latter, because of (18), $a()$ is false.

Assume that $S^+ = \{R_1^+, \ldots, R_m^+\}$, where $R_i^+$ has arity $k_i + 1$. Then $R_i^{++} \in S^{++}$
contributes a factor of $w^{n^{k_i}}$ to $w^{KB}_n(b())$, and the total weight is:

$$
w^{KB}_n(b()) = w^{KB}_n(I_{b()}) = \prod_{i=1}^{m} w^{n^{k_i}} = w^{K(n)}, \qquad (20)
$$

using for abbreviation $K(n) := n^{k_1} + \cdots + n^{k_m}$.
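
Equation (20) and the matching weight of a functional $a()$-model can be checked numerically. In the Python sketch below, `arities` lists the hypothetical values $k_1, \ldots, k_m$ (so that $R_i^+$ has arity $k_i + 1$); the function names and the sample values are assumptions of this illustration.

```python
def K(n, arities):
    """K(n) = n**k_1 + ... + n**k_m."""
    return sum(n ** k for k in arities)

def weight_b_model(n, arities, w):
    """Weight of I_b(): each maximal R_i^{++} has n**k_i true atoms, each worth w by (16)."""
    result = 1
    for k in arities:
        result *= w ** (n ** k)
    return result

def weight_functional_a_model(n, arities, w):
    """Weight from (13) when every R_i^+ is functional: n**k_i true atoms per relation."""
    return w ** K(n, arities)

n, arities, w = 3, [0, 1, 2], 5
```

Both weights equal $w^{K(n)}$, which is the calibration that makes $P^{KB}_n(a()) \geq 1/2$ whenever a functional model of $a()$ exists.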

We next turn to $w^{KB}_n(a())$ in the case $n \in spec(\phi)$. Then there exists at least one
interpretation $I \in \mathcal{I}(D_n, S \cup S^+)$, in which $\forall x \, \phi^{RSk}(x)$ is true, and in which the
relations from $S^+$ have a functional interpretation. We can expand this interpretation
to an interpretation in $\mathcal{I}(D_n, S \cup S^+ \cup S^{++} \cup \{a(), b()\})$ by giving all relations in $S^{++}$
an empty interpretation, and setting $a()$ to true and $b()$ to false. Then $I$ does not violate
any hard constraint in KB, and collects from (13) a total weight of $w^{K(n)}$. Thus

$$
w^{KB}_n(a()) \geq w^{K(n)},
$$

and therefore, when $n \in spec(\phi)$,

$$
P^{KB}_n(a()) \geq \frac{w^{KB}_n(a())}{w^{KB}_n(a()) + w^{KB}_n(b())} \geq 1/2.
$$

Finally, we have to consider $w^{KB}_n(a())$ in the case $n \notin spec(\phi)$. For any $I$ with
nonzero weight in which $a()$ is true, because of (11), also $\forall x \, \phi^{RSk}(x)$ must be true.
This, now, only is possible when some $R^+ \in S^+$ is not a functional relation, which,
because of (12), can only mean that for some $x$ there exists no $y$ with $R^+(x, y)$.

For a given $I$, let

$$
l(I) := \sum_{i=1}^{m} \left| \{ d \in D_n^{k_i} \mid \exists e : I \models R_i^+(d, e) \} \right| \qquad (21)
$$

In the case where $k_i = 0$, the summation term on the right of (21) is 1 if $R_i^+(e)$ holds
for some $e$, and 0 otherwise. Thus, $l(I)$ counts the number of $d$ that substituted for $x$ in
(13) contribute a weight factor $w$ to $w^{KB}_n(I)$. The weight of $I$ with $I \models a()$ then is $w^{l(I)}$
(noting that because of (18) $I$ cannot obtain any additional weight from interpretations
of $R^{++}$ relations).

We now count the number of interpretations with the precise weight $w^l$ collected
from (13). There are $K(n)$ different $d$, $R_i^+$-combinations $R_i^+(d, \cdot)$ that can lead to an
increment of 1 to the sum (21). Thus, there are $\binom{K(n)}{l}$ different selections of $R_i^+(d, \cdot)$
to obtain a sum of $l$. For each such selection, there are $n^l$ different choices for $e$ to
make $R_i^+(d, e)$ true. Thus, there is a total of $\binom{K(n)}{l} n^l$ different interpretations of the
relations in $S^+$ to obtain a weight of $w^l$. For each such interpretation in which $a()$ also
is true, the relations of $S^{++}$ are empty, and $b()$ is false. However, we still have to take
into account possible interpretations of the relations in $S$. If $L(n)$ is the total number
of ground $S$-atoms $R(d)$ ($R \in S$), then there are $2^{L(n)}$ different interpretations for $S$.
$L(n)$, like $K(n)$, is a polynomial in $n$. Thus, we can bound

$$
w^{KB}_n(a()) \leq \sum_{l=0}^{K(n)-1} \binom{K(n)}{l} n^l \, 2^{L(n)} \, w^l \qquad (22)
$$

where the sum is over all possible values of $l$ in which at least one symbol in $S^+$ does
not have a functional interpretation. Note that the right side of (22) may give a rather
extreme over-estimate of $w^{KB}_n(a())$, since most of the interpretations of $S \cup S^+$ that are
counted here with a weight of $w^l$ may make $\neg\phi^{RSk}(d)$ true for some $d$, and, thus, by
(11) have an actual weight of 0.

By further upper-bounding the right side of (22), we obtain

$$
w^{KB}_n(a()) \leq \sum_{l=0}^{K(n)-1} \binom{K(n)}{l} n^l \, 2^{L(n)} \, w^l \leq n^{M(n)} \, w^{K(n)-1}
$$

where $M(n) = (K(n) - 1)K(n) + (K(n) - 1) + L(n)$ is a polynomial in $n$. We now
obtain for the case $n \notin spec(\phi)$

$$
P^{KB}_n(a()) \leq w^{KB}_n(a()) / w^{KB}_n(b())
$$

Setting $w = 10 n^{M(n)}$, we thus have $P^{KB}_n(a()) \leq 1/10$ if $n \notin spec(\phi)$. The representation
size of $w$ is polynomial in $n$. Thus, an algorithm that computes $P^{KB}_n(a())$ up
to an accuracy of 0.2 = (0.5 − 0.1)/2 in time polynomial in $n$ and the representation
size of $w$ would give a polynomial time decision procedure for $spec(\phi)$. □